Step 1b added: run BOTH gates before claiming Goal-Lx PASS.
- Gate 1: `winml config` diff against shipped recipe (strip `_note`).
- Gate 2: `winml build` baseline on main without `-c`.
If both gates show parity, the recipe is catalog-only — do not file.
Audit on 2026-06-23 found 6 of 6 recent recipe PRs (#933 #934 #943
#944 #945 #946) had zero CLI-surface delta over auto-config output.
All 6 closed; replacement = user runs `winml build -m <id>` directly.
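The Step 1b decision can be sketched roughly as below. This is only an illustration: it assumes both the auto-config output and the shipped recipe are already loaded as JSON-like dicts, and the helper names (`strip_note`, `config_parity`, `gate_verdict`) are hypothetical, not part of the `winml` CLI. How Gate 2 build parity is judged is not specified here, so it is passed in as a plain boolean.

```python
import json
from typing import Any


def strip_note(node: Any) -> Any:
    """Recursively drop every `_note` key so annotations do not count as deltas."""
    if isinstance(node, dict):
        return {k: strip_note(v) for k, v in node.items() if k != "_note"}
    if isinstance(node, list):
        return [strip_note(v) for v in node]
    return node


def config_parity(auto_config: dict, shipped_recipe: dict) -> bool:
    """Gate 1 (sketch): configs match once `_note` fields are stripped."""
    return strip_note(auto_config) == strip_note(shipped_recipe)


def gate_verdict(gate1_parity: bool, gate2_parity: bool) -> str:
    """Both gates at parity means the recipe adds nothing over auto-config."""
    if gate1_parity and gate2_parity:
        return "catalog-only: do not file"
    return "real delta: proceed"


# Hypothetical inputs, for illustration only.
auto = json.loads('{"task": "translation", "quant": null}')
shipped = json.loads(
    '{"task": "translation", "quant": null, "_note": "filename is cosmetic"}'
)

gate1 = config_parity(auto, shipped)
print(gate_verdict(gate1, gate2_parity=True))
```

Keeping the two gates as separate booleans means a recipe is only dismissed as catalog-only when both checks agree, which matches the "run BOTH gates" rule above.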
SKILL.md additions:
- Step 0 Effort L0/L0★ guardrail
- Step 1b full procedure with verdict table
- Goal-axis guardrail (Lx evidence requires Step 1b real-delta)
- Step 4b trigger #8 (catalog-only escape) + next-id bump to 039
findings.json: _meta-038 with refines [_meta-013, _meta-018],
mechanism_confirmed=true, evidence cites the 6-PR audit.
PR: Helsinki-NLP/opus-mt-en-ru — translation recipe pair (fp32, CPU) — Goal-L2-encoder closed
Iter: 6 (composite recipe pair shipped iter-5 as marian-003; this PR adds the Goal-L2-encoder + L1-CPU evidence on top)
Producer: main agent (2026-06-23)
Claimed tier: `(Effort = L0★, Goal = L2-encoder, Outcome = L0)`

Summary
This PR ships the `Helsinki-NLP/opus-mt-en-ru` translation recipe pair (encoder + decoder). It is the FIRST seq2seq composite pair contributed to the recipe catalog, and the first Marian-family entry. The recipe was generated via `winml config --task translation` (per `_meta-020` composite-expansion gate); both halves build cleanly on CPU at fp32. Goal-L1-CPU PASSes on both halves; Goal-L2 cosine = 1.000000 on the encoder (PT-vs-ONNX). Goal-L2 on the decoder is `DEFERRED-HARNESS` per `_meta-018`; see the verdict table. No source-code changes.

Per `_meta-020`, encoder + decoder ship as ONE PR with a per-half verdict matrix.

1. Recipe files
Note on filename: `fp16_*` is cosmetic per `_meta-014`; `quant: null` means fp32 weights ship. `winml perf` correctly reports `Model Precision: fp32` (see L1-CPU evidence below). The cosmetic filename is retained for catalog consistency.

2. README index row
`examples/recipes/README.md`: row to add for `Helsinki-NLP/opus-mt-en-ru` | translation | composite (encoder + decoder) | recipe pair.

3. Build output directory + artifact inventory
`temp/marian_build/{encoder,decoder}/` (gitignored; referenced by path for reviewer re-execution). Each of the two directories contains:

- `model.onnx`
- `analyze_result.json`
- `export_htp_metadata.json`
- `winml_build_config.json`

External-data layout check (`_meta-023`): both halves are under the 2GB ProtoBuf limit ⇒ inline weights, no `.data` shard. N/A, vacuous PASS.

Encoder/decoder cross-attention alias check (`_meta-025`): encoder output = `encoder_hidden_states` (shape `[1,512,512]`); decoder input = `encoder_hidden_states` (shape `[1,512,512]`). Direct name + shape match. PASS.

4. Build log
Build logs at `temp/marian_build/{encoder,decoder}/build.log` (per marian-003 mechanism_notes). Iter-6 reused iter-5 artifacts unchanged: the recipe is byte-identical to the marian-003 commit, so no re-build was needed.

5. Appended findings
Per-model: `model_knowledge/marian.json` (`_meta-020`, encoder alias `_meta-025`, external-data `_meta-023`, `--ep-options` retry `_meta-026`, task-consistency `_meta-028`).

Skill-meta: `skill_meta/findings.json`

This PR does not introduce new `_meta-NNN` findings. The iter-6 methodology evolution (`_meta-019..037`) ships separately on the skills branch (Lane A per `_meta-033`).

6. Optimum-coverage probe verdict
Verdict: VENDOR-COVERED on `text2text-generation` (composite expansion → encoder = feature-extraction, decoder = text2text-generation). Effort L0★ confirmed. Per `winml config --task translation`, the user-facing task `translation` correctly composite-expands to the two sub-tasks; the decoder recipe's `task: text2text-generation` is the canonical sub-task name per `_meta-028`.

7. Claimed (Effort, Goal, Outcome) tier
- Effort = L0★ ([…] `winml config` invocation per checkpoint, no hand-edits beyond `_status` removal, which was never needed here)
- Goal = L2-encoder ([…] `_meta-018`)
- Outcome = L0 ([…] `winml.eval.compare_pt_onnx` helper" is captured under marian-005 gotchas but is methodology-scope)

8. Goal-ladder verdict table (per `_meta-018`, per-half per `_meta-020`)

Encoder:

- Build: `winml build` → `model.onnx`; opset 17; fp32 weights per `_meta-014`; structural validation via `onnx.load`
- L1-CPU: PASS; `[1, 512]` input. Log: temp/opus_en_ru_perf_enc_cpu.log
- L1-non-CPU: BLOCKED per `_meta-016`; same host caveat as bart-mnli
- L2: PASS; cosine = 1.000000 (PT-vs-ONNX)
- L3: CLI-BLOCKED per `_meta-015`; `winml eval` task registry does not include `translation` (no generative-text-to-text task)

Decoder:

- Build: `winml build` → `model.onnx`; opset 17; fp32 weights; structural validation via `onnx.load`
- L1-CPU: PASS; `[1, 1]` decoder_input_ids + `[1, 512, 512]` encoder_hidden_states + 6×past_KV pairs. Log: temp/opus_en_ru_perf_dec_cpu.log
- L1-non-CPU: BLOCKED per `_meta-016`
- L2: DEFERRED-HARNESS per `_meta-018`; needs proper DynamicCache↔past_KV reconstruction (open feature gap noted in marian-005). Log: temp/en_ru_l2_compare.log
- L3: CLI-BLOCKED per `_meta-015`

Short-circuit honored: no FAIL anywhere. L3 CLI-BLOCKED + L2-decoder DEFERRED-HARNESS do not halt the march per `_meta-018`. The honest ceiling is L2-encoder PASS.

Diligence ladder (`_meta-037`): not invoked. No BLOCKED-style verdict required a ladder walk; the two BLOCKED verdicts (L1-non-CPU + L3) are host/CLI capability gaps documented in existing findings, not failed attempts.

9. Methodology-evolution declaration (per `_meta-031`)

No NEW methodology friction in this PR. The composite-recipe pattern + `task=translation` routing + decoder L2 harness gap were all captured during iter-5 (marian-003..005); they ship as separate `_meta-NNN` findings on the skills branch under `_meta-019..030`. Triggers:

- […] `_meta-025`).
- […] `DEFERRED-HARNESS` was new during iter-5 but is now in the vocabulary.
- […] `_meta-019..030` already shipped.

Reviewer should confirm "no methodology friction observed" rather than REQUEST_CHANGES on absence per `_meta-031` anti-trigger.

Reviewer hand-off package: Step 6 9-item self-check